1 Data preparation and descriptive statistics

We first run Furia's prep script to load the TOPLAS data (Berger et al. 2019).

source("utils.R")
setup.data() 
toplas.data <- load.TOPLAS(cleanup=TRUE)
## Assertion:  nrow(data) == 1481307 :  pass
# Only 16 languages now
toplas.languages <- levels(toplas.data$language)
toplas.data <- by.project.language(toplas.data)

Next, we log-transform our predictors (and standardize ‘insertions’):

toplas.data$commits_log <- log(toplas.data$commits)
toplas.data$insertions_log <- log(toplas.data$insertions)
toplas.data$max_commit_age_log <- log(toplas.data$max_commit_age)
toplas.data$devs_log <- log(toplas.data$devs)
toplas.data$insertions_s <- scale(toplas.data$insertions)

Finally, we remove the columns we do not need and add a column representing ‘project’ as a numeric ‘project_id’.

toplas.data <- subset(toplas.data, select = -c(domain))
toplas.data$project_id <- as.integer(toplas.data$project)

Our data now look like this, where ‘n_bugs’ (\(\mathbb{N}\)) is our outcome variable and ‘language_id’ (\(\mathbb{N}^+\)), ‘commits’ (\(\mathbb{R}^+\)), ‘max_commit_age’ (\(\mathbb{R}^+\)), and ‘devs’ (\(\mathbb{N}^+\)) are our potential predictors.

glimpse(toplas.data)
## Rows: 1,039
## Columns: 14
## Groups: project [708]
## $ project            <fct> _s, 4clojure, 4clojure, accelerate, ack, ActionBar…
## $ language           <fct> Php, Clojure, Javascript, Haskell, Perl, Java, Jav…
## $ commits            <int> 174, 629, 75, 985, 97, 213, 796, 39, 64, 1404, 42,…
## $ insertions         <int> 3232, 9180, 39240, 99003, 425, 14657, 225043, 1628…
## $ max_commit_age     <int> 696, 774, 509, 1644, 2031, 192, 874, 1358, 1297, 1…
## $ n_bugs             <int> 0, 128, 29, 165, 12, 68, 157, 11, 9, 397, 19, 71, …
## $ devs               <int> 40, 21, 10, 17, 6, 5, 42, 13, 9, 170, 6, 35, 24, 1…
## $ language_id        <dbl> 13, 4, 10, 8, 12, 9, 9, 5, 10, 15, 13, 8, 11, 11, …
## $ commits_log        <dbl> 5.159055, 6.444131, 4.317488, 6.892642, 4.574711, …
## $ insertions_log     <dbl> 8.080856, 9.124782, 10.577452, 11.502905, 6.052089…
## $ max_commit_age_log <dbl> 6.545350, 6.651572, 6.232448, 7.404888, 7.616284, …
## $ devs_log           <dbl> 3.6888795, 3.0445224, 2.3025851, 2.8332133, 1.7917…
## $ insertions_s       <dbl[,1]> <matrix[26 x 1]>
## $ project_id         <int> 1, 2, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 10, 11, 12, 13…

and we have no NAs and only two zeros in the outcome variable, so no zero-inflation to speak of:

table(is.na(toplas.data))
## 
## FALSE 
## 14546
table(toplas.data$n_bugs == 0)
## 
## FALSE  TRUE 
##  1037     2

Since n_bugs \(\in \mathbb{N}\) we would expect to use a Poisson(\(\lambda\)) likelihood. However, if we look at some more descriptive statistics of ‘n_bugs’ we see that there is a large difference between the mean and the variance,

summary(toplas.data$n_bugs)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0      21      65     512     214  121302
var(toplas.data$n_bugs)
## [1] 16245308

clearly indicating overdispersion: a Poisson likelihood assumes the variance equals the mean, so we need to model the variance separately (each Poisson count observation should have its own rate). Hence, we’ll assume that the underlying data-generating process approximately follows a negative binomial distribution \(\mathrm{NB}(\lambda,\phi)\).1
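This mean–variance gap can be quantified directly from the statistics above. A quick base-R check of the index of dispersion, plus a method-of-moments guess at the NB shape (the numbers are taken from the summary() and var() output printed above):

```r
# Variance-to-mean ratio ('index of dispersion'); for a Poisson
# likelihood we would expect this to be close to 1.
m <- 512        # mean of n_bugs (from summary() above)
v <- 16245308   # var(n_bugs)
dispersion <- v / m
dispersion  # vastly greater than 1, i.e., massive overdispersion

# A negative binomial NB(mu, phi) has variance mu + mu^2 / phi,
# so it can accommodate variance far beyond the mean.
phi <- m^2 / (v - m)  # method-of-moments estimate of the shape
phi                   # a very small shape, i.e., a heavy-tailed NB
```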

2 Initial model development and out-of-sample comparisons

We design a number of models and then conduct out-of-sample prediction comparisons using PSIS-LOO (Vehtari, Gelman, and Gabry 2017). Our models, \(\mathcal{M}_1,\ldots,\mathcal{M}_3\), become more complex at each step. We start by designing a simple varying-intercepts model (varying according to ‘language_id’). For this and the following models we’ll set weakly regularizing priors (to which we return later when conducting prior predictive checks).

In \(\mathcal{M}_2\) we add predictors as population-level effects. Finally, in \(\mathcal{M}_3\) we use varying intercepts (‘language_id’) and varying slopes (one for each population-level effect). Additionally, we add another varying intercept (according to ‘project_id’).
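The prior object `p` passed to brm() below is not shown explicitly; a sketch of how one might encode the weakly regularizing priors listed later (in the prior predictive checks) with brms:

```r
library(brms)

# Weakly regularizing priors (see the prior predictive checks below):
# N(0,5) intercept, N(0,0.5) slopes, Weibull(2,1) group-level SDs,
# LKJ(2) correlations; the shape phi keeps brms's default Gamma(0.01, 0.01).
p <- c(prior(normal(0, 5),   class = Intercept),
       prior(normal(0, 0.5), class = b),
       prior(weibull(2, 1),  class = sd),
       prior(lkj(2),         class = cor))
```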

m1 <- brm(n_bugs ~ 1 + (1 | language_id),
          family = negbinomial(),
          data=toplas.data,
          prior = p)
m2 <- brm(n_bugs ~ 1 + devs_log + max_commit_age_log + commits_log + 
            insertions_s + (1 | language_id),
          family = negbinomial(),
          data = toplas.data,
          prior = p)
m3 <- brm(n_bugs ~ 1 + devs_log + max_commit_age_log + commits_log + 
            insertions_s +
            (1 + devs_log + max_commit_age_log + commits_log + 
               insertions_s | language_id) + 
            (1 | project_id),
          family = negbinomial(),
          data = toplas.data,
          prior = p)

Comparing out-of-sample prediction capabilities of the models, we see that \(\mathcal{M}_3\) takes the lead.

##     elpd_diff  se_diff  elpd_loo  se_elpd_loo   p_loo  se_p_loo     looic  se_looic
## m3       0.00     0.00  -4836.74        55.50  419.42     16.42   9673.48    111.00
## m2    -193.32    23.42  -5030.06        57.89   22.07      2.73  10060.13    115.78
## m1   -1911.49    51.89  -6748.23        72.66   49.23     16.35  13496.46    145.32

Let us focus on the two highest-ranked models, \(\mathcal{M}_3\) and \(\mathcal{M}_2\). \(\mathcal{M}_2\) has \(\textrm{elpd_diff}=\) -193.32 and \(\textrm{se_diff}=\) 23.42, hence with \(z_{99\%}=2.576\) we have [-253.66, -132.99]. Since zero is not in the interval, one could make a convincing claim that \(\mathcal{M}_3\) is significantly better (generally we want a model to be \(2\)–\(4\) SE away).
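The interval is simply \(\textrm{elpd_diff} \pm z \cdot \textrm{se_diff}\); in base R (reproducing the numbers above, up to rounding):

```r
elpd_diff <- -193.32
se_diff   <- 23.42
z <- qnorm(0.995)  # two-sided 99% quantile, ~2.576

ci <- elpd_diff + c(-1, 1) * z * se_diff
round(ci, 2)  # c(-253.65, -132.99); zero is well outside the interval
```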

In this case we’re after the best out-of-sample prediction, and we have strong indications that \(\mathcal{M}_3\) is ‘better,’ relatively speaking. Let us set \(\mathcal{M}_3\) as our target model \(\mathcal{M}\).

M <- m3

3 Prior predictive checks

For our target model \(\mathcal{M}\) we have a wide prior for our intercept, \(\alpha\) (i.e., \(\mathrm{N}(0,5)\)), a virtually flat prior for our correlation matrix, \(\mathcal{L}\) (i.e., \(\mathrm{LKJ}(2)\)), and wide priors for \(\sigma\), the standard deviations that model the group-level (‘random’) effects (i.e., \(\mathrm{Weibull}(2,1)\)). Finally, we keep the default prior \(\mathrm{Gamma}(0.01,0.01)\) for the shape parameter \(\phi\).

\[ \begin{eqnarray} \alpha & \sim & \textrm{Normal}(0,5)\\ \beta_1,\ldots,\beta_4 & \sim & \textrm{Normal}(0,0.5)\\ \mathcal{L} & \sim & \textrm{LKJ}(2)\\ \sigma & \sim & \textrm{Weibull}(2,1)\\ \phi & \sim & \textrm{Gamma}(0.01,0.01) \end{eqnarray} \]

Let’s sample only from priors.

M_priors <- brm(n_bugs ~ 1 + devs_log + max_commit_age_log + commits_log + insertions_s +
            (1 + devs_log + max_commit_age_log + commits_log + insertions_s | language_id) + 
            (1 | project_id),
          family = negbinomial(),
          data = toplas.data, 
          prior = p, 
          sample_prior = "only")

Figure 3.1: Dashed vertical line is max value, while dark blue line is our data and light blue lines are draws from the priors

If we plot the data, \(y\) (the dashed line is the maximum value of ‘n_bugs’ in our data), together with the output from our model using priors only, \(y_{\mathrm{rep}}\), we see that our priors still allow extreme values on the outcome scale.
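The shape of these prior draws can be mimicked without Stan. A base-R sketch, assuming brms’s default log link and, for simplicity, using only the intercept and shape priors (slopes and group-level terms omitted):

```r
set.seed(1)

n     <- 1e4
alpha <- rnorm(n, 0, 5)                     # intercept prior, N(0, 5)
phi   <- pmax(rgamma(n, 0.01, 0.01), 1e-8)  # shape prior; guard against underflow to 0
mu    <- exp(alpha)                         # inverse log link

# One NB draw per prior draw; rnbinom() is parameterized by mu and size (= phi)
y_prior <- rnbinom(n, mu = mu, size = phi)

# The priors still allow extreme values on the outcome scale
quantile(y_prior, c(0.5, 0.99, 1), na.rm = TRUE)
```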

4 Posterior predictive checks

Let’s now conduct posterior predictive checks, i.e., see how well our model fits our data (diagnostics have been checked, but we do not report them here to save space; for more information please see previous analysis).

Plot the four chains for each parameter that was sampled to check for the characteristic “fat hairy caterpillar” plots,

plot(M)

If we plot our kernel density estimates from our model with our empirical data we get a quick visual of the fit.

pp_check(M) + 
  scale_x_continuous(trans = "log2")

Figure 4.1: Kernel density estimates from our model, \(N=10\), with the distribution of our original data in a darker shade

A violin plot summarizes the fit for each language.

We can also check the conditional effects on our population-level parameters (see margin figures).

What do our population-level effects look like?2 (I will refrain from printing the varying effects since they would take up a lot of space.)

round(fixef(M), 2)
##                    Estimate Est.Error  Q2.5 Q97.5
## Intercept             -1.97      0.13 -2.22 -1.73
## devs_log               0.07      0.02  0.04  0.11
## max_commit_age_log     0.07      0.02  0.03  0.11
## commits_log            0.99      0.01  0.97  1.02
## insertions_s           0.01      0.03 -0.04  0.09

Let’s next plot our varying effects for ‘language’ (varying according to our slopes ‘devs,’ ‘max_commit_age,’ ‘commits,’ and ‘insertions’).

4.1 Predictions

The plots below show posterior predictions of ‘n_bugs’ when our covariates are set to their empirical median and minimum levels (setting them to maximum levels would increase uncertainty for a number of languages, i.e., Scala, Clojure, Perl, Go, Coffeescript, Haskell, and Python).

What is clearly evident is that the order (languages are plotted in a descending order from left to right) changes depending on covariates’ settings.

Which of the above two plots tells the truth? Well, the truth depends on your context, i.e., on practical significance.

Let’s assume that we have a project with \(30\) developers, where previous data in our company show \(1500\) commits, a max commit age of \(1000\), and approximately \(8000\) insertions, and that you can only choose between two languages: Python or Ruby.
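Using the population-level estimates from fixef(M) above, we can sketch the expected outcome for this scenario in base R. This is only the fixed-effects part, assuming the log link; we set insertions_s to 0 (roughly the sample mean), since the scaling parameters for 8000 insertions are not shown, and we ignore the language- and project-level offsets that drive the Python-vs-Ruby difference:

```r
# Population-level posterior means from fixef(M) above
b <- c(Intercept = -1.97, devs_log = 0.07,
       max_commit_age_log = 0.07, commits_log = 0.99, insertions_s = 0.01)

# Our hypothetical project: 30 devs, max commit age 1000, 1500 commits
x <- c(1, log(30), log(1000), log(1500), 0)

mu <- exp(sum(b * x))  # expected n_bugs on the outcome scale
round(mu)              # roughly 400 bugs, before language/project offsets
```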

If we look at the plot above, all things being equal, you should probably consider using Ruby in your project.

But do we have a ‘significant difference’ between these two languages, or for that matter among all languages?

4.2 Effect sizes

First, let’s conduct posterior predictive checks on each language. We’ll use covariates’ values from the sample as input, with the idea that the sample is representative of the population.

Next, we calculate the contrast between each language (120 combinations when we have 16 languages). Then it’s very easy to check the distribution of the difference between any two languages and in that way also investigate effect sizes (there’s no need to look at point estimates since we have a probability distribution of the differences between two languages).3
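The mechanics of the contrast computation can be sketched in base R with placeholder draws (a draws × languages matrix; in the actual analysis these columns would come from the per-language posterior predictions, here we simulate them just to show the mechanics):

```r
set.seed(2)

n_draws <- 4000
langs   <- paste0("lang", 1:16)

# Placeholder posterior-predictive draws, one column per language
yrep <- matrix(rnbinom(n_draws * 16, mu = 50, size = 2),
               nrow = n_draws, dimnames = list(NULL, langs))

pairs <- combn(langs, 2)  # all pairwise combinations: choose(16, 2) = 120

# Distribution of the difference for each pair, with 95% credible intervals
contrasts <- apply(pairs, 2, function(p) {
  d <- yrep[, p[1]] - yrep[, p[2]]
  c(mean = mean(d), quantile(d, c(0.025, 0.975)))
})
colnames(contrasts) <- apply(pairs, 2, paste, collapse = " - ")
contrasts[, 1:3]
```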

If we plot the difference between C# and C we see that it’s positive, indicating that C almost always performs worse than C# (had it been negative, C# would have been worse).

Let’s also look at a case where there’s no significant difference (Coffeescript vs. Go) and one where there is a clear difference in the other direction (Objective-C vs. Ruby), i.e., where the values are clearly negative and, thus, Objective-C is clearly worse.

Once again, these are the distributions in differences we get when we reuse the data from our sample and average over them. As we saw previously, with other covariates we get other outcomes.

In the final analysis we’ll look briefly at the differences the two models display, and at how much the varying effects contribute to predictions. Finally, we’ll run simulations using one of the models, because that’s where the rubber hits the road and all that we’ve done can come to some use (rather than trying to answer a question which really isn’t that interesting, i.e., which language is, generally speaking, the ‘best’).

5 Bayesian variable selection for \(\textrm{NB}(\lambda,\phi)\)

We’ll conduct variable selection for the negative binomial (NB) likelihood. Unfortunately, the Stan team does not yet (January 2020) support variable selection for the NB.4 However, there’s a package that uses MCMC for variable selection in NB models (Dvorzak and Wagner 2016).

Set our outcome variable:

y <- toplas.data$n_bugs

Select predictor variables and store them as a matrix. We remove ‘n_bugs’ (our \(y\)), the untransformed original variables, and the factor variables ‘language’ and ‘project’ (since they are indicators we’ll use later as varying intercepts).

# Drop columns 1-8 and 13, keeping columns 9-12 and 14 (cf. glimpse() above)
X <- as.matrix(toplas.data[, c(-1:-8, -13)])

Run the variable selection,

varsel_res <- negbinBvs(y = y, X = X)

and plot the chain and output the summary:

plot(varsel_res, burnin = FALSE, thin = FALSE)

summary(varsel_res)
## Bayesian variable selection for the negative binomial model:
## 
## Call:
## negbinBvs(y = y, X = X)
## 
## 
## MCMC:
## M = 8000 draws after a burn-in of 2000
## BVS started after 1000 iterations
## Thinning parameter: 1
## 
## Acceptance rate for rho:
## 71.73%
## 
## Prior: spike-and-slab prior with Student-t slab [V=5]
## 
## b0[0] b0[1] b0[2] b0[3] b0[4] b0[5] 
##     0     0     0     0     0     0 
## w[a] w[b]   c0   C0 
##    1    1    2    1 
## 
## 
## Model averaged posterior means, estimated posterior inclusion
##  probabilities and 95%-HPD intervals:
## 
##             Estimate P(.=1) 95%-HPD[l] 95%-HPD[u]
## (Intercept)   -2.268     NA     -2.491     -2.075
## beta.1         0.929  1.000      0.901      0.958
## beta.2         0.065  1.000      0.043      0.083
## beta.3         0.078  1.000      0.044      0.107
## beta.4         0.078  1.000      0.052      0.102
## beta.5         0.000  0.002      0.000      0.000
## rho            7.221     NA      6.432      7.908

In short, compared to our analysis of the FSE data set, where we wanted to exclude one variable, here all main predictors (beta.1–beta.4, each with a posterior inclusion probability of 1) should be included.

6 Computational environment

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 4.0.3 (2020-10-10)
##  os       macOS Big Sur 10.16         
##  system   x86_64, darwin17.0          
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  ctype    en_US.UTF-8                 
##  tz       Europe/Stockholm            
##  date     2021-01-27                  
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package        * version     date       lib
##  abind            1.4-5       2016-07-21 [1]
##  arrayhelpers     1.1-0       2020-02-04 [1]
##  assertthat       0.2.1       2019-03-21 [1]
##  backports        1.2.1       2020-12-09 [1]
##  base64enc        0.1-3       2015-07-28 [1]
##  bayesplot      * 1.8.0       2021-01-10 [1]
##  bookdown         0.21        2020-10-13 [1]
##  boot             1.3-26      2021-01-25 [1]
##  bridgesampling   1.0-0       2020-02-26 [1]
##  brms           * 2.14.6      2021-01-23 [1]
##  Brobdingnag      1.2-6       2018-08-13 [1]
##  callr            3.5.1       2020-10-13 [1]
##  cli              2.2.0       2020-11-20 [1]
##  coda             0.19-4      2020-09-30 [1]
##  codetools        0.2-18      2020-11-04 [1]
##  colorspace       2.0-0       2020-11-11 [1]
##  colourpicker     1.1.0       2020-09-14 [1]
##  crayon           1.3.4       2017-09-16 [1]
##  crosstalk        1.1.1       2021-01-12 [1]
##  curl             4.3         2019-12-02 [1]
##  DBI              1.1.1       2021-01-15 [1]
##  desc             1.2.0       2018-05-01 [1]
##  devtools         2.3.2       2020-09-18 [1]
##  digest           0.6.27      2020-10-24 [1]
##  distributional   0.2.1       2020-10-06 [1]
##  dplyr          * 1.0.3       2021-01-15 [1]
##  DT               0.17        2021-01-06 [1]
##  dygraphs         1.1.1.6     2018-07-11 [1]
##  ellipsis         0.3.1       2020-05-15 [1]
##  evaluate         0.14        2019-05-28 [1]
##  fansi            0.4.2       2021-01-15 [1]
##  farver           2.0.3       2020-01-16 [1]
##  fastmap          1.0.1       2019-10-08 [1]
##  forcats        * 0.5.0       2020-03-01 [1]
##  fs               1.5.0       2020-07-31 [1]
##  gamm4            0.2-6       2020-04-03 [1]
##  generics         0.1.0       2020-10-31 [1]
##  ggdist           2.4.0       2021-01-04 [1]
##  ggplot2        * 3.3.3       2020-12-30 [1]
##  ggridges       * 0.5.3       2021-01-08 [1]
##  ggthemes       * 4.2.4       2021-01-20 [1]
##  glue             1.4.2       2020-08-27 [1]
##  gridExtra        2.3         2017-09-09 [1]
##  gtable           0.3.0       2019-03-25 [1]
##  gtools           3.8.2       2020-03-31 [1]
##  highr            0.8         2019-03-20 [1]
##  htmltools        0.5.1.1     2021-01-22 [1]
##  htmlwidgets      1.5.3       2020-12-10 [1]
##  httpuv           1.5.5       2021-01-13 [1]
##  httr             1.4.2       2020-07-20 [1]
##  igraph           1.2.6       2020-10-06 [1]
##  inline           0.3.17      2020-12-01 [1]
##  jsonlite         1.7.2       2020-12-09 [1]
##  kableExtra     * 1.3.1       2020-10-22 [1]
##  knitr            1.30        2020-09-22 [1]
##  labeling         0.4.2       2020-10-20 [1]
##  LaplacesDemon  * 16.1.4      2020-02-06 [1]
##  later            1.1.0.1     2020-06-05 [1]
##  latex2exp      * 0.4.0       2015-11-30 [1]
##  lattice          0.20-41     2020-04-02 [1]
##  lifecycle        0.2.0       2020-03-06 [1]
##  lme4             1.1-26      2020-12-01 [1]
##  loo              2.4.1       2020-12-09 [1]
##  magrittr         2.0.1       2020-11-17 [1]
##  markdown         1.1         2019-08-07 [1]
##  MASS             7.3-53      2020-09-09 [1]
##  Matrix           1.3-2       2021-01-06 [1]
##  matrixStats      0.57.0      2020-09-25 [1]
##  memoise          1.1.0       2017-04-21 [1]
##  mgcv             1.8-33      2020-08-27 [1]
##  mime             0.9         2020-02-04 [1]
##  miniUI           0.1.1.1     2018-05-18 [1]
##  minqa            1.2.4       2014-10-09 [1]
##  munsell          0.5.0       2018-06-12 [1]
##  mvtnorm          1.1-1       2020-06-09 [1]
##  nlme             3.1-151     2020-12-10 [1]
##  nloptr           1.2.2.2     2020-07-02 [1]
##  patchwork      * 1.1.1       2020-12-17 [1]
##  pillar           1.4.7       2020-11-20 [1]
##  pkgbuild         1.2.0       2020-12-15 [1]
##  pkgconfig        2.0.3       2019-09-22 [1]
##  pkgload          1.1.0       2020-05-29 [1]
##  plyr             1.8.6       2020-03-03 [1]
##  pogit          * 1.2.0       2019-01-17 [1]
##  prettyunits      1.1.1       2020-01-24 [1]
##  processx         3.4.5       2020-11-30 [1]
##  projpred         2.0.2       2021-01-23 [1]
##  promises         1.1.1       2020-06-09 [1]
##  ps               1.5.0       2020-12-05 [1]
##  purrr            0.3.4       2020-04-17 [1]
##  R6               2.5.0       2020-10-28 [1]
##  Rcpp           * 1.0.6       2021-01-15 [1]
##  RcppParallel     5.0.2       2020-06-24 [1]
##  remotes          2.2.0       2020-07-21 [1]
##  reshape2         1.4.4       2020-04-09 [1]
##  rethinking     * 2.13        2020-11-21 [1]
##  rlang            0.4.10      2020-12-30 [1]
##  rmarkdown        2.6         2020-12-14 [1]
##  rprojroot        2.0.2       2020-11-15 [1]
##  rsconnect        0.8.16      2019-12-13 [1]
##  rstan          * 2.26.0.9000 2021-01-21 [1]
##  rstantools       2.1.1       2020-07-06 [1]
##  rstudioapi       0.13        2020-11-12 [1]
##  rvest            0.3.6       2020-07-25 [1]
##  scales           1.1.1       2020-05-11 [1]
##  sessioninfo      1.1.1       2018-11-05 [1]
##  shape            1.4.5       2020-09-13 [1]
##  shiny            1.6.0       2021-01-25 [1]
##  shinyjs          2.0.0       2020-09-09 [1]
##  shinystan        2.5.0       2018-05-01 [1]
##  shinythemes      1.2.0       2021-01-25 [1]
##  StanHeaders    * 2.26.0.9000 2021-01-21 [1]
##  statmod          1.4.35      2020-10-19 [1]
##  stringi          1.5.3       2020-09-09 [1]
##  stringr          1.4.0       2019-02-10 [1]
##  svUnit           1.0.3       2020-04-20 [1]
##  testthat         3.0.1       2020-12-17 [1]
##  threejs          0.3.3       2020-01-21 [1]
##  tibble           3.0.5       2021-01-15 [1]
##  tidybayes      * 2.3.1       2020-11-02 [1]
##  tidyr          * 1.1.2       2020-08-27 [1]
##  tidyselect       1.1.0       2020-05-11 [1]
##  tufte          * 0.9         2020-12-02 [1]
##  usethis          2.0.0       2020-12-10 [1]
##  utf8             1.1.4       2018-05-24 [1]
##  V8               3.4.0       2020-11-04 [1]
##  vctrs            0.3.6       2020-12-17 [1]
##  viridisLite      0.3.0       2018-02-01 [1]
##  webshot          0.5.2       2019-11-22 [1]
##  withr            2.4.0       2021-01-16 [1]
##  xfun             0.20        2021-01-06 [1]
##  xml2             1.3.2       2020-04-23 [1]
##  xtable           1.8-4       2019-04-21 [1]
##  xts              0.12.1      2020-09-09 [1]
##  yaml             2.2.1       2020-02-01 [1]
##  zoo              1.8-8       2020-05-02 [1]
##  source                                
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  Github (paul-buerkner/brms@2f0d3bb)   
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.1)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.1)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  Github (stan-dev/projpred@dc1c3e4)    
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  Github (rmcelreath/rethinking@3b48ec8)
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  local                                 
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  local                                 
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.1)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.3)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
##  CRAN (R 4.0.2)                        
## 
## [1] /Users/torkarr/Library/R/4.0/library
## [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

References

Berger, E. D., C. Hollenbeck, P. Maj, O. Vitek, and J. Vitek. 2019. “On the Impact of Programming Languages on Code Quality: A Reproduction Study.” ACM Trans. Program. Lang. Syst. 41 (4): 21:1–24. https://doi.org/10.1145/3340571.
Dvorzak, M., and H. Wagner. 2016. “Sparse Bayesian Modelling of Underreported Count Data.” Statistical Modelling 16 (1): 24–46. https://doi.org/10.1177/1471082x15588398.
Vehtari, A., A. Gelman, and J. Gabry. 2017. “Practical Bayesian Model Evaluation Using Leave-One-Out Cross-Validation and WAIC.” Statistics and Computing 27: 1413–32. https://doi.org/10.1007/s11222-016-9696-4.

  1. Wikipedia entry for the negative binomial distribution↩︎

  2. For an excellent introduction to these terms please see this post↩︎

  3. with 95% credible intervals↩︎

  4. Projection predictive variable selection↩︎